Integrated context-dependent networks in very large vocabulary speech recognition
Authors
Abstract
All the components used in the search stage of speech recognition systems (language model, pronunciation dictionary, context-dependent network, HMM model) can be represented by finite-state labeled networks. To construct real-time recognition systems, it is important to optimize these networks and to combine them efficiently. We present new methods that substantially improve these steps. We show that an efficient recognition network including context-dependent and HMM models can be built using weighted determinization of transducers [6]. We report experiments on a 463,331-word vocabulary North American Business News (NAB) task that show a substantial improvement in recognition speed over our previous method [9]. Furthermore, the size of the integrated context-dependent networks constructed can be dramatically reduced using a factoring algorithm that we briefly describe. With our construction, the integrated NAB network contains only about 1.3 times as many arcs as the language model it is constructed from.
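To make the idea of weighted determinization concrete, here is a minimal sketch. It is not the authors' implementation: it handles weighted acceptors rather than transducers, assumes the tropical semiring (weights add along a path, and the minimum is taken across paths), and assumes the input is determinizable, for example acyclic. The names `determinize`, `cost`, and `ARCS` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict, deque

def determinize(arcs, start, finals):
    """Weighted subset construction over the tropical semiring (illustrative).

    arcs:   {state: [(label, weight, next_state), ...]}
    finals: {state: final_weight}
    Assumes the automaton is determinizable (e.g. acyclic); otherwise
    this loop may not terminate.
    """
    init = frozenset({(start, 0)})
    det_arcs = {}    # {subset: {label: (weight, next_subset)}}
    det_finals = {}  # {subset: final_weight}
    queue = deque([init])
    seen = {init}
    while queue:
        subset = queue.popleft()
        # For each label, keep the best cost to every destination state.
        by_label = defaultdict(dict)
        for state, residual in subset:
            for label, weight, nxt in arcs.get(state, []):
                c = residual + weight
                best = by_label[label]
                if nxt not in best or c < best[nxt]:
                    best[nxt] = c
        det_arcs[subset] = {}
        for label, dests in by_label.items():
            # Emit the smallest cost on the arc; carry the rest as residuals.
            out_weight = min(dests.values())
            target = frozenset((q, c - out_weight) for q, c in dests.items())
            det_arcs[subset][label] = (out_weight, target)
            if target not in seen:
                seen.add(target)
                queue.append(target)
        final_costs = [r + finals[q] for q, r in subset if q in finals]
        if final_costs:
            det_finals[subset] = min(final_costs)
    return init, det_arcs, det_finals

def cost(det, word):
    """Follow the single path for `word`; return its cost, or None if rejected."""
    init, det_arcs, det_finals = det
    subset, total = init, 0
    for label in word:
        if label not in det_arcs[subset]:
            return None
        weight, subset = det_arcs[subset][label]
        total += weight
    if subset not in det_finals:
        return None
    return total + det_finals[subset]

# Toy nondeterministic acceptor: state 0 has two arcs labeled "a".
ARCS = {
    0: [("a", 1, 1), ("a", 3, 2)],
    1: [("b", 4, 3)],
    2: [("b", 1, 3), ("c", 2, 3)],
}
det = determinize(ARCS, start=0, finals={3: 0})
print(cost(det, "ab"), cost(det, "ac"))
```

After determinization each input string follows at most one path, so a decoder no longer has to track competing partial paths that share the same label sequence; the residual weights stored in each subset are what keep the total path costs unchanged.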
Similar resources
Full expansion of context-dependent networks in large vocabulary speech recognition
We combine our earlier approach to context-dependent network representation with our algorithm for determinizing weighted networks to build optimized networks for large-vocabulary speech recognition combining an n-gram language model, a pronunciation dictionary and context-dependency modeling. While fully expanded networks have been used before in restrictive settings (medium vocabulary or no cr...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
ACID/HNN: clustering hierarchies of neural networks for context-dependent connectionist acoustic modeling
We present the ACID/HNN framework, a principled approach to hierarchical connectionist acoustic modeling in large vocabulary conversational speech recognition (LVCSR). Our approach consists of an Agglomerative Clustering algorithm based on Information Divergence (ACID) to automatically design and robustly estimate Hierarchies of Neural Networks (HNN) for arbitrarily large sets of context-depend...
Hierarchies of neural networks for connectionist speech recognition
We present a principled framework for context-dependent hierarchical connectionist HMM speech recognition. Based on a divide-and-conquer strategy, our approach uses an Agglomerative Clustering algorithm based on Information Divergence (ACID) to automatically design a soft classifier tree for an arbitrarily large number of HMM states. Nodes in the classifier tree are instantiated with small estimator...
Large Vocabulary Search Space Reduction Employing Directed Acyclic Word Graphs and Phonological Rules
Some applications of speech recognition, such as automatic directory information services, require very large vocabularies. In this paper, we focus on the task of recognizing surnames in an Interactive telephone-based Directory Assistance Services (IDAS) system, which supersedes other large vocabulary applications in terms of complexity and vocabulary size. We present a method for building compa...